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Abstract. In this paper, we introduce a class of local indicators that enable to compute effi- 
ciently optimal transport plans associated to arbitrary weighted distributions of N demands and M 
supplies in M in the case where the cost function is concave. Indeed, whereas this problem can be 
solved linearly when the cost is a convex function of the distance on the line (or more generally when 
the cost matrix between points is a Monge matrix), to the best of our knowledge, no simple solution 
has been proposed for concave costs, which are more realistic in many applications, especially in 
economic situations. The problem we consider may be unbalanced, in the sense that the weight of 
all the supplies might be larger than the weight of all the demands. We show how to use the local 
indicators hierarchically to solve the transportation problem for concave costs on the line. 

1. Introduction. The origins of optimal transportation go back to the late eigh- 
teen century, when Monge '16| published his Memoire sur la theorie des deblais et des 
remblais (1781). The problem, which was rediscovered and further studied by Kan- 
torovich in the 1940's, can be described in the following way. Given two probability 
distributions fi and v on X and c a measurable cost function on X x X, find a joint 
probability measure tt on X x X with marginals and v and which minimizes the 
transportation cost 

c{x,y)dTT{x,y). (1.1) 

XxX 

Probability measures tt with marginals /i and v are called transport plans. A transport 
plan that minimizes the cost (11.11) is said to be optimal. 

When the measures /i and v are discrete (linear combinations of Dirac masses), 
the problem can be recast as finite linear programming. For > 1, consider two 
discrete distributions of mass, or histograms, given on M^: {{pi,Si)}, which represents 
"supplies" at locations pi with weights Si and {{qj, dj)}, which represents "demands" 
at locations qj with weights dj (notation from [T]) and assume that all values of Sj 
and dj are positive reals with S := Si and D := dj. The problem consists in 
minimizing the transport cost 

where 7^ is the amount of mass going from pi to qj, subject to the conditions 

lij > 0, ^Jij < Sj, ^Itj < dj, ^7jj = inin(S', £>). (1.3) 

j i i,j 
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Fig. 1.1. On the left: optimal plan associated to a concave cost. On the right: optimal plan 
associated to a convex cost. Supplies are represented by red points and demands by blue crosses. 

The matrix of values 7 — {'^ij} is still called transport plan. When S — D, the problem 
is said to be balanced and is only a reformulation of (|l.ip for discrete measures. 
When S =/= D, the problem is said to be unbalanced. The cases S < D and S > D can 
be treated in the same way. This paper deals with balanced problems and unbalanced 
problems of the form S > D. 

In the unitary case, i.e. when all the masses Si and dj are equal to a single value 
V, it turns out that if 7 is optimal, for all i,j, jij € {0, u} and for all j there exists 
only one i such that jij — v (each demand receives all the mass from one supply) . In 
the balanced case, the matrix 7 is thus a permutation matrix up to the factor v. In 
the unbalanced case, the permutation matrix is padded with some zero rows. As a 
consequence, the balanced case boils down to an assignment problem, known as the 
linear sum assignment problem. Such problems have been thoroughly studied by the 
combinatorial optimization community ^5j. 

Optimal transportation problems appear in many fields, such as economy or 
physics for instance, see e.g. [4j [8l [13]. In economic examples optimal transport 
is often related to the field of logistic where supplies are furnished by producers at 
specific places pi and in specific quantities di, while demands corresponds to con- 
sumers locations and needs. Depending on the application, various cost functions c 
can be used. For instance, concave functions of the distance appear as more realistic 
cost functions in many economic situations. Indeed, as underlined by McCann [15], a 
concave cost ^''translates into an economy of scale for longer trips and may encourage 
cross-hauling.^^ 

During the last decades, many authors have taken interest in the study of exis- 
tence, uniqueness and properties of optimal plans [2l lll[ [T4] . with a specific interest 
for convex costs, i.e. costs c that can be written as convex functions of the distance 
on the line. Detailed descriptions of these results can be found in the books [25] [26], 
One case of particular interest is the one-dimensional case, which, when c is a convex 
function of the distance on the line, has been completely understood [22 both for 
continuous and discrete settings. Indeed, this problem has an explicit solution that 
does not depend on c (provided that it is convex) and consists in a monotone rear- 
rangement (see Chapter 2.2 of ^25]). In the unitary case, this property can also be 
seen as a consequence of another interesting result, true for any dimension N , which 
says that the linear sum assignement problem is solved by the identical permutation, 
provided that the cost matrix {c{pi, qj))i.j is a Monge matrix^ [5j. Several approaches 
have been proposed to generalize the convex one-dimensional result to the case of the 
circle, where the starting point for the monotone rearrangement is not known, and 
its choice and hence the optimal plan itself, unlike in the case of an interval, do de- 
pend on the cost function c. Most of these approaches concern either the unitary 
case [H UHl [271 ISl [3 or the more general discrete case lO [TTl [1^1 [T51[^ . 
Recently an efficient method has been introduced to tackle this issue in a continu- 



matrix C is said to be a Monge matrix if it satisfies Cij +ci^i < cn +Ckj when i < k and j < /. 
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Fig. 1.2. On the left: solutions associated to the concave cost c{x,y) = \x — y\^'^ , and on the 
right to the cost c{x,y) = \x — y\^'^ ■ Supplies are represented by red points and demands by blue 
crosses. 

ous setting [9 . Unfortunately, these results on the line and the circle do not extend 
to non-convex costs, in particular to concave costs (see Figure [Lll for an example). 
Although it is of broad interest for many applications, few works treat this case (see 
however the important paper [15j ) and computing solutions is far from obvious in gen- 
eral. Indeed, contrary to the convex case on the line, optimal plans strongly depend 
on the choice of the function c. Consider the case of two unitary supplies at positions 
Pi = and p2 = 1.2 and two unitary demands at positions qi = 1 and (72 = 2.2 on 
the line, as drawn on Figure FOl If the cost function is c{x,y) = \x — y]"'^, the left 
solution will be optimal, whereas the other one will be chosen for c(a;, y) = \x — y\^'^ . 
For a convex cost, the left solution would always be chosen. 

In practice, when no analytic solution is given (i.e. most of the time), finding 
optimal plans can be a tedious task. As underlined before, in a discrete setting, the 
problem can be written as a linear programming problem, and optimal plans can 
be constructed numerically by using for instance the simplex method or specialized 
methods such as the auction algorithms [3 and various algorithms for the assignment 
problem (see [5j for details) . However these methods do not take into account essential 
geometric features of the problem, such as the fact that it is one-dimensional or that 
the cost function is concave. 

The goal of this paper is to introduce a class of functions that reveals the local 
structure of optimal transport plans in the one-dimensional case, when the cost c 
is a concave function of the distance. As a by-product, we build an algorithm that 
permits to obtain optimal transport plans in the unitary case in less than O(iV^) 
operations in both balanced and unbalanced cases, where N is the number of points 
under consideration. Once generalized to the non unitary case, the complexity of this 
algorithm becomes 0{N^) in the worst case but turns out to be smaller for "typical" 
problem instances. However, let us insist that our aim is not to compete with recent 
linear assignment algorithms, which may be more interesting in practice, at least 
for balanced problems, but rather to achieve a more complete understanding of the 
internal structure of the assignment problem for concave costs on the line. 

Observe that our algorithm complements the method suggested by McCann [TS] , 
although the approach we follow here is closer to the purely combinatorial approach 
of [1]. The results of this last work, in which the cost c{x, y) = \x — y\ was considered, 
are extended here to the general framework of strictly concave cost functions. (Note 
that the very special case considered in |lj may be also regarded as convex, which 
allows to apply the sorting algorithm on the line or results of [5] on the circle.) 

The paper is organized as follows: Section [2] is devoted to the presentation of 
the optimal transport problem. In Section [3l we focus on transport problems on 
"chains" which are particular cases where demands and supplies are alternated. In 
this framework, we present the main result of the paper, which states that consecutive 
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matching points in the optimal plan can be found thanks to local indicators, indepen- 
dently of other points on the line. Thanks to the low number of evaluations of the 
cost function required to apply the indicators, we derive from this result a rather effi- 
cient algorithm to compute optimal transport plans. We then consider more general 
frameworks, namely general unitary cases in Section |4] and real- valued masses situa- 
tions in Section [5l In Section [6l we conclude with remarks on the implementation of 
our algorithm and show that its complexity scales as 0(N'^) in the worst case. Some 
technical proofs about this last result are given in Appendix. 

2. The optimal transport problem. This paper deals with the problem of 
finding an optimal transport plan in the case where the problem contains possibly 
more supplies than demands and the transport cost is strictly concave: the larger the 
distance to cover is, the less the transport costs per unit distance, while the marginal 
cost (the derivative of the cost function) decreases monotonicaly. 

Consider two integers M, N and two sets of points P — {pi: i = 1, . . . , M} and 
Q =^ {qi: i = 1,...,A'^} in R that represent respectively the supply and demand 
locations. Let Si > be the capacity of ith supply and dj > the capacity of jth 
demand. We suppose that 5* := J^i Si > D :— J^j dj, i.e. that the problem may be 
unbalanced. 

We deal with minimizing the cost 

C{j)^J2<P^'1^^^^^^ (2.1) 

where c{pi,qj) G K+ is the cost resulting from transport of a unit mass between pi 
and qj. The quantity is the amount of mass going from pi to gj, subject for all 
i, J to the conditions 

li3 > 0, ^ 7»j < s,;, ^ l^o ^ dj (2.2) 

i i 

(observe that since D < S, these conditions are equivalent to (|1.3p V We call the case 
S = D balanced and the case S > D unbalanced. Observe that in the latter case the 
total supply is larger that the total demand, and therefore some of the supplies may 
remain underused (^jjij < Si). 

As mentioned in Introduction, an optimal transport problem associated to equal 
masses, i.e. V(i,j) € P x Q, Si — dj = v, reduces actually to an assignment problem, 
where masses cannot be cut. Indeed, it is well known (see Section 2.2 of 5 for a 
proof) that if 7 minimizes the cost ()2.1|) under conditions (j2.2|) . then without loss of 
generality one can assume that ^ij g {0,w} for all i, j, so that the problem can be 
reformulated as finding the minimum of the quantity 

C(o-) = c(Pa-iO),gj), (2.3) 

over all partial maps a: {1, . . . , M} — > {1, . . . , N} whose inverse is injective and 
defined for all 1 < j < N: namely j = a{i) and i ~ cr^^{j) iff 'Jij — 1. This setting is 
the one of Sections [3] and S] 

We focus on the case where the function c involves a strictly concave function as 
stated in the next definition. 

Definition 2.1. The cost function c in ()2.3p is said to be concave if it is defined 
by c{p, q) = g{\p — q\) with p, g € R, where g: M.'^ — > R U {—00} is a strictly concave 
non- decreasing function such that g(0) := lima;_>o gi^) ^ —00. 
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Note that strict concavity of g implies its strict monotonicity. Some examples of 
such costs are given by g(x) = log a; with g{0) — —oo and g{x) = \fx with (7(0) = 0. 
If 5(0) > — cx), we assume without loss of generality that 5(0) = (this changes the 
value of (j2.ip by an amount D g(0) independent of the transport plan). 

In what follows, we denote by 7* a given optimal transport plan between P and 
Q: C(7*) < C{-f) for all 7 satisfying (|2.2p . Observe that if two points pi and qj 
have the same position, then there exists an optimal transport plan 7"* between P 
and Q such that 7*^- = min{si,dj}, i.e. that all mass shared by the two marginal 
measures stays in place 25j. Indeed, suppose that a supply p and a demand q located 
at the same point are not matched together but to some other demand and supply p' 
and q' located at distances x and y respectively. Irrespective of whether g{0) = or 
g{0) = —00, as soon as g is strictly concave, one has 

g{0) + g{x + y) <g{x)+giy) 

for all X, y, which implies that matching p and q is cheaper. Therefore a common point 
of P and Q with unequal values Si and dj may be replaced with a single supply of 
capacity Si — dj , if this quantity is positive, or with a single demand of capacity dj ~ Si. 
In the following, we will therefore assume that common points do not exist, i.e. that 
the sets P and Q are disjoint. 

Another significant feature of concave costs is that trajectories of mass elements 
under an optimal transport plan do not cross each other, as described by the following 
lemma. 

Lemma 2.2 ( "Non-crossing rule" ) . Consider two pairs of points {p,q) and {p',q') 
such that 

c{p, (?) + c{p', q') < c{p', q) + c{p, q'). (2.4) 

Then, the open intervals 

I = (min(p, (7), max(p, g)), /' = (min(p', g'), max(p', (7')) 

are nested, in the sense that the following alternative holds: 

1. either I H I' is empty, 

2. or one of these intervals is a subset of the other. 

This result directly follows from the concavity of the cost function and is often 
referred to as the "non-crossing rule" [1] [15] . The proof is based on the same ideas 
as used in p3]. Essentially, the case p < q' < q < p' and the similar case with p's 
and g's interchanged are ruled out in view of (|2.4|) by monotonicity of g, whereas the 
case p < p' < q < q' and the symmetrical one are ruled out by the strict concavity 
of g. 

In the unbalanced case, some supplies may lie outside all nested segments. 

Definition 2.3. A point r € PUQ is said to be exposed in the transport plan 7 
if r ^ (mhi{pi,qj),Tnax{pi,qj)) whenever ^ij > 0. 

A sufficient condition for a supply to be exposed is given in the next statement. 

Lemma 2.4. In the unbalanced case all underused supplies are exposed in an 
optimal transport plan. 

Indeed, should an underused supply pi belong to the interval between and qj„ 
such that jioj„ > 0, the amount of mass equal to min{7i(,j(, , Sj — X^jTijl could be 
remapped to go to qjg from pi rather than pi^ , thus reducing the total cost of transport 
because of the strict monotonicity of the function g. 
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3. Transport plans on chains. In this section, we focus on tlie particular case 
of chains, that are situations where all the masses are equal and alternated on the 
line, i.e. where P and Q satisfy M — N {balanced case) and 

pi < qi < ■ ■ ■ < Pi < Qi < pi+i < qi+i < ■ ■ ■ < pn < Qn, (3.1) 

OT M = N + 1 {unbalanced case) and 

pi < qi < ■ ■ ■ < Pi < Qi < Pi+i < Qi+i < ■ ■ ■ < pn < Qn < PN+i- (3.2) 

In these cases the set PUQ is called balanced chain and unbalanced chain respectively. 
Sections coming after this one will aim at extenting our results to more general cases. 

Recall that in such a framework, optimal transport problems are actually as- 
signment problems, where masses cannot be cut: Optimal transports plan are then 
described by permutations. 

3.1. Main result. Thanks to the non-crossing rule, one knows that in any op- 
timal transport plan there exists at least two consecutive points {pi,qi) or {qi,pi^i) 
that are matched. Starting from this remark, we take advantage of the structure of 
a chain to introduce a class of indicators that enable to detect a priori such pairs of 
points. 

Definition 3.1 (Local Matching Indicators of order k). Given < k < N — 1, 
consider 2k + 2 consecutive points in a chain. If the first point is a supply pi, define 

fc-i k 
Iki^) = c(k, q^+k) + c{p^+j+i,qi+.j) - ^ c{p^+j,q^+j), 
3=0 j=o 

else denote the first point qi and define 

k k 
= c(K+fc+i, (?,) + ^ c{p,+j,q^+j) - ^ c{p^+j+i,q,+j). 

This definition is schematically depicted in Figure [01 in the case k = 2. 



— o< o< 

Fig. 3.1. Schematic representation of an indicator of order 2. 

Note that in the first alternative of this definition, we have necessarily 1 < fc < 
N— \,l<i<N — k. In the second alternative, we have necessarily 1 < A: < iV — 2 
and 1 < i < N — k — \ in the balanced case and 1 < fc < iV — 1 and l<z<iV — fcin 
the unbalanced case. The interest of these functions lies in the next result. 

Theorem 3.2 (Negative Local Matching Indicators of order k). Let fco G N with 
^ ^ ka < N — 1 and ig G N, such that 1 < io < N ~ kg. In the unbalanced case, 
suppose in addition that g is strictly monotone. 

Assume that 

1- Iki^) > for k = 1, . . . ,ko - 1, io < i < io + ko - k, 
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Fig. 3.2. Schematic representation of the result of Theorem \3.2\ in the case fco = 1. 



2. > for k ^ 1, . . . ,ko - 1, ia < i' < io + ko - k - 1, (resp. io < i' < 
io + ko — k in the unbalanced case ) 

3. Il{i,)<Q. 

Then any permutation a associated to an optimal transport plan satisfies a{i) = 
i — 1 for i = io + 1, . . . ,iQ + ko- 

If the third condition is replaced by I^^iio) < (with the same bounds on ko and 
io in the unbalanced case, and with 1 < ko < N ~ 2 and 1 < io < N ~ ko ~ 1 in 
the balanced case), then any permutation a associated to an optimal transport plan 
satisfies a{i) = i for i = + 1, . . . , io + fco- 

This result is represented in broad outline in Figure [3?2] For practical purposes, 
these indicators allow to find pairs of neighbors that are matched in an optimal trans- 
port plan. It also shows that the usual c-cyclical monotonicity condition of optimal- 
ity (see [24 and 10 ) can be improved in the concave case: only specific subsets have 
to be tested to check the optimality of a transport plan. 

3.2. Algorithm. We now derive from the previous theorem a simple algorithm 
to compute an optimal transport plan in the case of chains. For the sake of simplicity, 
we only consider the balanced case. The unbalanced case can be treated in the same 
way. 

The local matching indicators defined in Definition 13.11 can be used recursively 
to compute optimal transport plans for chains. The elementary step of this approach 
consists in finding a negative indicator satisfying the hypothesis of Theorem l3.2l in the 
list of supplies and demands. Once this step is achieved, the inner points involved in 
this indicator are matched as prescribed in Theorem 13.21 and removed from the list. 

As for the research, it is performed by means of a loop that updates iteratively 
the pair (fco, io), following the lexicographic sorting (over admissible pairs), as long as 
positive indicators are found. In this way, the hypothesis of Theorem 13.21 are satisfied 
when a negative indicator is found. At the beginning of the algorithm or when a 
negative indicator is found, the set of admissible pairs is updated and the current pair 
is set to (1, 1). 

We denote by cr* the map for which this minimum is attained. 
Algorithm 1. 

• SetV = {pi,...,p,v,gi,...,9iv}, = {l,.-.,^}, ^« = and 

k = 1; 

• while V ^% and k <N 

1. compute I^{i) and Il{i') for i = 1, . . . , A — fc and i' = 1, . . . , A — fc — 1; 
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2. define 



I'k ={io,l<io< N-k, IP {io)<0}, 

It = {^o, l<^o<N-k-l, I^ito) < 0}; 

3. ijll = % and ll = 0, then set k ^ k + 1; 

4. else do 

— for all io in and for i — iq + 1, . . . ,iQ + k, do 

* define a* i£P)=ej_i^ 

* remove {piv,qii ^} from V, 

* remove £f and if from P and respectively; 

— for all Iq in and for j = Jq + 1, . . . , Iq + fc, do 

* define cr*(^f ) = ij, 

* remove {pgp, qgi} from V , 

* remove and if from P and i"^ respectively; 

— set N ^ \Card{V), and rename the points in V such that 

V = {pi, . . . ,PN,qi, ■ ■ ■,qN}, 

pi < qi < ■ ■ ■ < Pi < qi < Pi+i < qi+i < ■ ■ ■ < Pn < qN; 

— set k — 1; 

• if k ^ N ~ 1, for i ^ 1, . . . , N set a*{£P) = If. 

A first alternative algorithm consists in testing the sign of each /f («) and as 
soon as they have been computed and in removing the corresponding pairs of points 
whenever a negative value is found. A second alternative consists in carrying out 
the research of pairs (iq, fco) that satisfy the hypothesis of Theorem 13.21 following the 
lexicographic order associated to the counter (io + 2fco, feg). 

3.3. Proof of Theorem [3721 

3.3.1. Technical results. This section aims at introducing technical results 
that are required to prove Theorem 13.21 We keep the notations introduced therein. 
We start with a basic result that plays a significant role in the proof of Theorem 13.21 
As it was the case for the non-crossing rule (Lemma 12. 2p . the concavity of the cost 
function is an essential assumption of this lemma. 

Lemma 3.3. We keep the previous notations. For x^y G define 

fe-1 fe-1 
"Pk^ii^^ y)^9{x + y + q^+k -Pi) + ^ c.{p,+j+i,q,+j) - g{x) - g{y) - ^ c{p^+j,q,+j), 

3=0 i=i 

for fc, i e N, such that 1 < k < N - 1 and 1 < i < N - k, and 

k k-1 

'PkA^' ^ 9(x + y+Pi+k+i-qt)+^c{pi+j,qi+j)-g{x)-g{y)~^c{pi+j+i,qi+j), 

for fc, i G N, such that 1 < k < N — 2 and l<i<N^k— 1 in the balanced case and 
1 < k < N — 1 and 1 < i < N ^ k in the unbalanced case. Both functions ip^ j^{x,y) 
and ifl j(x,y) are decreasing with respect to each of their two variables. 
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This lemma is a direct consequence of the concavity of the function g. 

To deal with unbalanced chains, we need two additional lemmas, one of them 
requiring that g is strictly monotone. The first result we need is usually referred as 
"The rule of three" in the literature 15 . 

Lemma 3.4 ("rule of three"). Suppose that g is strictly monotone. Given p < 
g < p' < g' G M, suppose that 

c{p, q') + c(p', q) < cip, q) + c{p' , q'). (3.3) 

Then \p' — q\ < min(|p — q\, \p' — q'\). 

Proof: Since g is increasing and since |p — ^'l > niax(|]3 — q\,\p' — q'\), Inequality p.3p 
implies that c{p',q) < min(c(p, q), c(p', g')). The result follows the fact that g is 
strictly increasing. □ 

We shall also make use of the following generalization. 

Lemma 3.5. Suppose that g is strictly monotone. Under Hypothesis and 
of Theorem \3. 2\ the following inequalities are satisfied 

\qi -Pi+i\ < mm{\pig - qi\, \pi+i - gio+fcj), Vi S {iq, . . . , iq + fco - !}■ 

If Hypothesis\^is replaced by I^^iio) < and Hypothesis (0i holds, one finds 

\p^ - q^\ < mm{\q,>^ - p^\, \q^ - Pt'^+ko+i\), Vi £ {i'o + I, ■ ■ . ,i'o + fco}- 

Proof: Let i G {io, . . . , io + fco — 1}. Hypothesis (121) of Theorem 13.21 implies that 

io+fco io+fco — 1 

c{p^+l,q^) + c{p^o,q^„+ko) < ^c{pj,qj)- ^ c{pJ+l,qj)+c{p^+l,q^). 

j=io j=io 

Now, because of Hypothesis (H]), we have /f_j^(io) > and /j^_|_;.^_j_i(i + 1) > 0, 
which means that 



c{Pj,qj) < c{p,^,q^) + c{pj+i,qj) 



and 



Thus, 



io+ka io + fco-l 

XI c(Pj,gi) < c(K+i,gj„+feJ + Y c(pj+i,(?j). 



c{pi+i,qi) + c(p,o, g^o+Zco) < c(pio, (?i) + c{pi+i,qi„+ko). 

We conclude with the rule of three. The result in the case /^^ (iq) < can be deduced 
by symmetry. □ 



Note that in the two previous proofs, the only necessary hypothesis is that the 
cost is a strictly increasing function of the distance. In particular the result also holds 
in the case where the cost function is increasing and convex. 

Lemma 3.6 ("partial sums"). Under Hypothesis (QJ) and (0j of Theorem \3.2[ for 
any i in {iq + 1, . . . , io + fco} o,nd i' in {iq, . . . , iq + fco ^ 1}; the following inequalities 
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are satisfied: 



^ c{pj ,qj)>^ c{pj+i, Qj), (3.4) 



io+fco io+fcfl — 1 

c{Pj.qo)> (3-5) 
j-i'+l j-i' 

If Hypothesis\^is replaced by I'^g{i'o) < and Hypothesis (0) holds, one finds 

i—l i 

^c(pj+i,gj)> ^ c(pj,(?j), (3.6) 

and 

io+fco io+ko 

^ c(pj+i,gj)> ^ c{pj,qj). (3.7) 

j=i' + l j=j' + l 

Proof: In order to prove inequality (13.41) . remark that since If^ika) < 

i— 1 io + fc"o io + fco 

io + fc-o — 1 io+fco 

for i such that Jo + 1 < * < *o + ^o- Moreover, since If^_^_f.^_^{i) > one has 

i— 1 io+fco — 1 io+fco — 1 

X cfe, 9j) > c(Ko: ^io+fco) + X - - X c(pj+i, gj). 

j=io j=io j=i 

Since g is increasing, this leads to the inequality p.4p . The proof of Equations p.5l - 
13. 7p follows the same path. □ 



We are now in the position to prove our main result. In a first part we focus on 
the balanced case, and then go to the unbalanced case, which requires more efforts. 

3.3.2. The balanced case. Consider the balanced case, i.e., the situation cor- 
responding to p.ip . We focus on the case where I^^iio) < 0. The case Il^ii'o) < 
can be treated the same way. 

The proof consists in proving that Hypothesis ([T]-[3]) of Theorem 13.21 imply that 
neither demand nor supply points located between pi^ and Qio+fco can be matched 
with points located outside this interval, i.e. that the set = {pj , io + 1 < j < 
io + ^o} U {qj, io < j < io -\- ko — 1} is invariant under an optimal transport plan. In 
this case, the result follows from Hypothesis ([TH21). 

Suppose that S'l° is not preserved by an optimal transport plan a*. According 
to the non-crossing rule, three cases can occur: 
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a) There exists ii G N, such that 1 < ii < io and io < cr*{ii) < iq + kg — 1 and there 
exists i'l e N, such that (T*{ii) + 1 < «i < io + and io + ^ ^ ^• 

b) There exists 12 G N, with io + 1 < 12 < io + ^o such that 1 < o'*(i2) < io — 1- 

c) There exists i2 G N, with io + ^o < i2 < such that io < cr*{i2) < io + fco. 
We first prove that Case[aj) cannot occur. 

In Case[aj), one can assume without loss of generality that o'*(ii) is the largest 
index such that 1 < ii < io, io ^ ''■'^(ii) ^ io + ^0 ~ 1 and that i'^ is the smallest index 
such that (7*(ii) + 1 < i'^ < iq + ko, io + ko < cr*{i[) < N . Assume also that we are 
not in Cases |b| or|c}. With such assumptions and because of the non-crossing rule, 
the (possibly empty) subset {pi, cr*{ii) + 1 < i < i'l — 1} U {g,, cr*(ii) + 1 < i < i'l — 1} 
is stable by a*. Because of Hypothesis ID-El), no nesting (i.e. no pair of nested 
matchings) can occur in this subset, and cr*(i) = i for i = c*(ii) + 1, . . . , i'j^ — 1. 

On the other hand, since a* is optimal, one has: 

i'l-l 

c{p^,,qa*{i^)) + c{p^'^,qa*ii[))+ c{pj,qj)<c{pi,,q^*(^,,^))+ c{pj+i,qj). 

j=(T*(ii) + l 

Thanks to Lemma l3.3| one deduces from this last inequality that: 

c{Pto,qa*(n)) + c{Pi\.qio+ko)+ Y < c(p»o>9*o+fco)+ Y c(Pj+i,gj), 

j=o-*(ii) + l J=CT*(il) 

and then; 

(T*(n)-1 io+ko-l 

+ c{pj,qj) <c{p.,„,q^„+ko)+ Y c(Pi+i,gj)- (3.8) 

j=cr* (ii) + l j=io 

According to Hypothesis ([T]), I^* > and (i'l) > 0, so that: 

cr*(ii) CT*(li)-l 
Y ^^Pj ' 9j ) - "^(P'O , 9rT* (il ) ) + Y + 1 ' 9i ) 



io+ko ia+ko — 1 

X] c(Pi,gj) < c(pi'^,gio+feo) + Y c(Pj+l,gj). 



Combining these last inequalities with (j3.8p one finds that: 

io+fco ia+ko — 1 

Y 9j) < c(pio, 9io+feo) + Y "^(Pj+i' 

which contradicts Hypothesis ([3]). 

Let us now prove that Cases |b| and|g) contradict the assumptions. As Cases [b]) 
and[cj) can be treated in the same way, we only consider Caselb]). Without loss of 
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generality, one can assume that ^2 is the smaUest index such that io + 1 ^ *2 ^ *o + 
and cr*(«2) < *o — 1- Because of the non-crossing rule and the fact there are necessarily 
as many demands as supplies between qi^ and pi^ , there exists one and only one index 
i'2 such that iQ < cr*(i2) < 12 — 1 and 1 < ^2 — *o- Consequently, the non-crossing 
rule implies that the (possibly empty) subsets {pi,io + 1 < i < a*{i'2)} U {qi,iQ < 
i < (T*{i'2) - 1} and {pi,(T*{i'2) 1 < z < 22 - 1} U {qt,a*{i'2) -|- 1 < i < i2 - 1} are 
stable by an optimal transport plan. Because of Hypothesis (IlH2]), no nesting can 
occur in these subsets, and cr*(i) = i — 1 for i = iq + 1, . . . ,cr*{i'2) and cr*(i) = i for 
i = cr*{i'2) -I- 1, . . . ,«2 - 1. 

On the other hand, since a* is optimal, one has 

22-1 

cip^2,qa*i^2))+ciP^'^,<la*{^'^))+ c(Pj'<7j-l)+ ^ c{Pj , Qj) 

< c(p,^,g<,*(,,)) + c{pj,qj^i). 

j=io + l 

Thanks to Lemma 13.31 one deduces from this last inequality that: 

0-*(«2) 42-1 

) ) + C{p^„ ,qa*{t'^))+ Y ^(Pj^l3~^)+ Y ^^P^ ' * ) 
j=io+l j=(T*(i^) + l 

<c{p^o,qa^^2))+ (3.9) 
j=io + l 

Because the cost is supposed to be increasing with respect to the distance, one finds 
that c{pio,q^.(i^)) < c{pi^,q^(^i^)), so that ((X^ implies: 

0-* {i'2) 12 -1 12 

ciPto,qaHt:,))+ Y Y c(pj,(7j)< Y c(Pi,gj-i), 

j=io-|-l j=o-*(iJ,) + l j=io-l-l 

and then: 

j=4o+i j=o-(i;;j)+i j=i2+i 

io+fco 

< Y c(p,, 9,-1). (3.10) 

j=io-l-l 

According to Hypothesis ([T]), I^^i' -j^i^iio) ^ 0, so that: 
Combining these last inequalities with p. 101) one finds that: 

io+ko io +kg -l 

Y '^^Pi ' 9j) ^ c(p,o , q,^+ko ) + Y ^^P3+^ ' 
12 



which contradicts Hypothesis 

We have then shown that neither demand nor supply points located between pig 
and Qig^ko+i can be matched with points located outside this interval. The set is 
then stable by an optimal transport plan. According to Hypothesis UHl]), no nesting 
can occur m 5f " . The result follows. □ 



3.3.3. The unbalanced case. We then show that Theorem 13.21 still holds in 
the unbalanced case. We start with the case /^^(jo) < 0. 

Observe first that none of the points , io + 1 ^ J ^ *o + can remain unmatched 
in an optimal transport plan. Indeed, assume on the contrary that there exists i 
in {iQ + 1, ... ,10 + ko} such that pe is unmatched in an optimal transport plan a* . 
Note first that no nesting can occur in S^^ , so that the points in this set can only 
be matched either with their neighbors or with points outside this set. According to 
Lemma 13.61 

e-1 e-1 

j=io j=io 

Therefore we cannot have cr*(i) = i for i = iq, . . . ,£—1: otherwise it would be possible 
to rematch all the points qi in this interval to their right neighbors and reduce the 
cost. Hence, as the point pi is unmatched and, because of Lemma l2.4| exposed, there 
exists m in {io, ...,£— 1} such that (cr*)~^(m) < ip. Choose m to be the greatest 
value of the index satisfying this property. Since no nesting can occur in S^" , we have 
(T*(i) = i for all i in the (possibly empty) interval m + l<i<£ — I. Now, since g is 
an increasing function, 

e-1 e-1 m 

(m) 1 Qrn 

j=m+l j=io j=io 

Using again Equation p.4p of Lemma 13.61 one deduces from this last inequality that 

t-l e-1 rn 

c{p{a*)-^{m),qm) + ^ c(pj,(7j) > c{p^^,q„,) + ^ c{pj+i,qj) - Y c{Pj,qj). 

j=m+l j=io j=io 

It follows from this and from > that 

e-1 m-l 
j=m+l j=io 

e-1 m 

j=7n j=io 

e-1 

In other words, it is cheaper to match each qi, m < i < £ — 1, with its right neighbor 
Pi+i and to exclude P(a*)-^{m) than to match each qi with its neighbor pi and to 
exclude pi. In all cases, the point pi cannot remain unmatched. 
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If the point is matched in the transport plan a* , then we can conclude by the 
already proved first part of Theorem l3.2l that a*(i) = z — 1 for z = zq + I, . . . , io + ^o (ac- 
cording to Lemma 12.41 unmatched points are exposed, the existence of an unmatched 
Pi outside of [pig,qig-^-ko] has no consequence on this result). 

Now, assume that remains unmatched and that there exists m' in {iq, . . . ,iQ + 
fco — 1} such that (cr*)^^ (to') ^ m' + 1. Since Pig is exposed, and since all points of Si^ 
are matched and no nesting can occur in Si^ , there exists m' in {iq, . . . ,ia + ko — 1} 
such that (o'*)^^(m') > io + ko- One can assume without loss of generality that m' is 
the largest index in {zq, • ■ • , *o + ^ 1} satisfying {a*)~^{m') > zq + kg. 

Actually, to' < + fco — 1. Indeed, suppose that m' = io + feg — 1: on the one 
hand, because of Hypothesis HIE]), the rule of three (variant, Lemma [X5)l implies that 
\Pio+ko - qio+ko-i\ < \Pio+ko - 9to+fcol- But on the other hand, since the matchings 
) and (pio+fcQ, gCT*(io+feo)) belong to an optimal transport plan, the 
rule of three (standard version. Lemma [3.4p implies |pio+feo — qig+ka~i\ > \Pia+ko ~ 
Qa*{io+ko)\- Because of the non-crossing rule, cr*(io 4- fco) > *o + ko, hence \pia+k„ — 
qia+k„-i\ > \Pta+ka " 9io+fcol- This provides a contradiction. 

Two cases can now occur: either a*{i) = i for all i in {to' + 1, . . . , zq + ko}, or 
there exists a unique supply pk in {m' -I- 1, . . . , ig + fco} such that (7*(k) > iq + kg. 
This cannot happen for two different supplies in {to' -I- 1, . . . , zg -I- fco}, otherwise there 
would be another demand qi between these supplies such that {a*)~^{£) > io + kg. 

In the first case, thanks to equation p.Sp 



io+ko 



io + fco — 1 




20 +'£0-1 




j=m' 



which contradicts the optimality of a*. 
In the second case, since /^^^(zo) < 0, 




c{Pj,q3) + C{pk,qcr*(k)) > C{qm' ,P{cr*)-Hni')) + c{pk, qa*{k)) 



j—m'-\-l 




io-\-ko 



Now, since /^,„jj,(zo) > and If^ 



io+ko — k 



(fc) > 0, this inequality yields 



fc-i 




+c{pk ,qa*{k)) - c{pk, qzo + ko ) + c(pio J qio+ko 




The two differences that appear in the right-hand side are positive so that 

fc-i 

ciqm',P{a^)-^(rn')) + ^ c{pj , Qj) + c{pk, (^k)) > c{q„* {k),P{a*)-^{m')) 
j—rn'~\-l 

fc-1 

which also contradicts the optimahty of a*. 

By symmetry, the theorem remains vahd in the case where (iq) < instead of 
W<0. " □ 



4. General unitary case. We now focus on the general unitary case, i.e. sit- 
uations where Si — dj — 1 for all i, j and therefore S = M and D — N < M. As a 
main result, we shall explain how this case can be recast in independent problems in- 
volving chains. Recall that as it was the case for chains, in such a framework, optimal 
transport plans are described by permutations. 

A consequence of the non-crossing rule 12.21 is usually called the local balance of 
supplies and demands: in the unitary case, there are as many supplies as demands 
between any two matched points pi,, and gj„ . We derive from this property a definition 
of chains in the case of unit masses. Given a supply point pi, define its left neighbor 
q'^ as the nearest demand point on the left of pi such that the numbers of supplies 
and demands in the interval {q'i,Pi) are equal; define the right neighbor q'{ oi pi in a 
similar way. Furthermore define left and right neighbors of a demand point qj to be 
the supply points that have qj as their right and left neighbor, respectively. Iterating 
this procedure gives raise to a chain. 

Definition 4.1 (unitary case). A chain in P U Q is a maximal alternating 
sequence of supplies and demands of one of the forms 

1. (Ki,gji,...,p^,,gjJ, 

2. {qj^,p,„...,qj^,p^^), 

3- {Pin qjn ■ ■ ■ T qjh-nPih) ! 

with k>\ and such that each pair of consecutive points in the sequence is made of a 
point and its right neighbor. 

Examples of chains are shown in Figure HHJ Observe that because of Case ([3]), 
some chains can be composed of only one (unmatched) supply, and no demand. 

Because of the local balance property, matching in an optimal plan can occur 
only between points that belong to the same chain. Indeed, an extension of the proof 
of Lemma 3 of [1] shows that the family of chains forms a partition of P U Q. As a 
consequence, each chain is preserved by an optimal transport plan. For example, if a 
chain is composed of a single supply, it cannot be matched in any optimal transport 
plan and can thus be dismissed from the problem. In summary, the general unitary 
problem can be decomposed into independent problems that only deal with chains, 
and then apply the results of Section [3l 

Note finally that the construction of the set of chains only depends on relative 
positions of supplies and demands and does not involve any evaluation of the cost 
function. The exact construction is not described here because it is subsumed by the 
algorithm presented in subsection 15.21 
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Fig. 4.1. Example of a problem containing two chains. Top: chains represented as collections 
of dashed arcs. Bottom: chains represented as dashed lines connecting elements of mass that are 
left and right neighbors (cf Fiaure \5. IV . 

5. Non-unitary case. 

5.1. Chains in real-valued histograms. In this case the notions of right and 
left neighbors should be defined for infinitesimal elements of supply and demand. 
The corresponding definition may be given in purely intrinsic terms, but the following 
graphical representation makes it more evident. 

Consider the signed measure J2i SiSp. — J2j dj5q- on the real line, where 5x is 
a unit Dirac mass at x. Plot its cumulative distribution function F, whose graph 
has an upward jump at each pi and a downward jumps at each g^, and augment it 
with vertical segments to make the graph into a continuous curve (Figure 15.11 cf also 
Figure 14. ip . Thus, e.g., the segment corresponding to a supply point pi connects 
the points of the graph with coordinates (pi, F{pi)) and {pi, F{pi + 0) = F{pi) + Si) 
(assuming left continuity of F). Here and below in figures similar to Figure lOl vertical 
segments corresponding to supply points are plotted in red and those corresponding 
to demand points in blue (color online). 

Infinitesimal elements of supply and demand are pairs of the form {pi,y') with 
F{pi) <y'< F{p, + 0) and iqj,y") with F{q, + 0) < y" < F{q,). Geometrically a 
supply element {pi, y') (demand element (gj, y")) corresponds to the point (pi, y') (re- 
spectively, {qj , y" ) ) in the vertical segment corresponding to the supply pi (demand qj ) 
in the graph of the cumulative distribution function F (see Figure [STTjl . 

For an infinitesimal element of supply {pi, y) define 

r(pi,y) = min{gj £ Q : q^ > Pi,F{q.i + 0) < y < F{qj)}, 
^{Pt, y) = max{(jj £Q: qj < p„ F{qj + 0) < y < F{qj)} 

(with the usual convention min0 = oo, max0 = — oo) and call the mass ele- 
ments {r{pi, y), y) and {i{pi, y), y) respectively the right neighbor and the left neighbor 
of {pi,y) if r{pi,y) and i{pi,y) are finite. The definition of right and left neigh- 
bors is then extended to elements of demand by defining r{qj,y) = pi whenever 
qj = i{pi,y) > — oo and £{qj,y) = pi whenever qj = r{pi,y) < oo. Inspection of 
Figure [5T] should make these definitions clear. 

Definition 5.1 (real- valued case). A chain is a sequence of elements of mass 
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Fig. 5.1. Example of construction of chains for a problem with general masses (color online). 
Red points and blue crosses mark the values of the cumulative distribution function F at supply 
points Pi and demand points qj according to the convention of left continuity. 

Small white circles represent a pair of neighboring demand and supply elements. Chains con- 
necting some neighboring mass elements are shown with dashed lines. All chains have the same 
structure in each horizontal stratum delimited with dotted lines. Capacity m^. := yk—i — yk of 
stratum k measures the amount of mass exchanged in that stratum. For example, the subsegment 
denoted /x^^j (resp. lJLf\) represents the share of supply located at pi (resp. of demand located at qn) 
that participates in the mass exchange in stratum 2 (resp. 3j. Detailed explanations are given in the 
text. 

Observe that the problem is unbalanced, and chains in strata 5 and 6 have three supplies and 
two demands. 



that has one of the forms 

1- {{Pii,y), {<lji,y),---APik^y)^ iUk^y)) ^(Ki,2/) = -oo, r{qj^,y) = oo; 

2- ((gjo,2/),(p»i,2/),...,(gj,_i,2/), fe,,2/)) mth e(qj„,y) = -oo, r{p,^,y) = 00; 

3- {{Pti,y). ('7ji,2/), • ■ fe-i,2/).{Pife,2/)) with i{pi^,y) = -00, r{p,^,y) = 00. 
Here k > 1 and qj„^_i = £{pi^^,y) > -co, qj^ = r{pi^^,y) < 00 for all m between 1 
and k except the cases specified above. 

Note that chains have similar structure inside strata defined in the above graphical 
representation as bands separated by horizontal lines corresponding to ordinates from 
the set {F{pi ± 0), . . . , F{pM ± 0), ± 0), . . . , F{qN ± 0)}: within each stratum all 
left and right neighbors are the same and only the y parameters differ. 

5.2. Data structure and algorithm for computing chains. We now de- 
scribe how to efficiently compute and store the structure of chains and strata for a 
given histogram. This discussion applies both for the real and unitary case (the lat- 
ter is degenerate in that all elements of each supply and demand point belong to a 
single stratum, cf Figures and [STTjl . Our construction is an adaptation of that of 
Aggarwal et al [1] Section 3] with somewhat different terminology and notation. 

The basic storage structure can be described as follows. Observe that for a supply 
point Pi the function r[pi, ■) is piecewise constant and right continuous on the segment 
[F{pi),F{pi + 0)]. For each p., build a list consisting of triples {Pi,y[^rn^r{pi,y[^)) in 
the increasing order of m > 0, where q = F{pi) and y[„i corresponds to mth 
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Fig. 5.2. Lists £ (solid arrows) and Co (dashed arrows) encoding the structure of the histogram 
from Fiaure V5.1\ See explanations in the text. 



jump of r{j)i, •) as the second argument increases. For a demand point {q.j,d.j) build 
a similar list of triples iqj,yj[rm''^iljjyj,m)) where j/"q — F{qj) and decreases 
with TO. Finally build a list C as concatenation of these lists for all supply and 
demand points in P U Q in the increasing order of the abscissa. In Figure 15.21 which 
features the same histogram as Figure 15.11 the elements of the combined list £ are 
represented with thick solid arrows. Their order corresponds to traversing the p^'s 
and (7j's left to right and for each of these points, to listing the right neighbors in 
the increasing order of y for pi and in the decreasing order of y for g^: in short, to 
traversing the continuous broken line formed by the graph of F together with the red 
and blue vertical segments. 

Note that all elements in C that start with pi have one of the two following forms: 
{Pi,F{pi),qj) with qj = r{p,,F{pi)) or {pi,F{qj + 0),qj) with pi = e{qj,F{qj + 0)). 
Similarly, elements starting with qj have either the form {qj , F{qj),pi) with pi = 
r{qj,F{qj)) or {qj,F{pi + 0),Pi) with qj = £{pi,F{pi + 0)). Therefore all elements 
of C involve one of the values F{pi ± 0) or F{qj ± 0) and hence C has at most 
2{M + N) elements. To see this refer to Figure and observe, e.g., that the function 
r{pi, •) has a jump at y only when, during the upward scan of the vertical segment 
corresponding to supply pi, one encounters on the right the bottom end of a vertical 
segment corresponding to qj = r{pi, y) (i.e., the point with y = F{qj + 0)). A similar 
observation holds for downward scan of segments corresponding to demand elements. 

The list £ can be regarded as a "dictionary" that allows to look up the right neigh- 
bor of any supply element (p, y) or demand element (g, y). To do this, e.g., for (p, y), 
locate in C an element (p, y) immediately preceding {p, y) and return the element 
(j'{p,y),y). Again, inspection of Figure \5l2\ should convince the reader that this pro- 
cedure is correct. Note that the search operation in an ordered list of length 0{AI +N) 
requires an 0(log(M -I- N)) number of comparisons. 

The list £ can be built in a linear number of operations 0{M + N) using the 
following algorithm. Here Sp, Sq are stacks storing pairs of the form {r,X) where 
r e P U Q and X e M. 
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Algorithm 2. 

• Set Sp ^0,Sq^0,listC^0,f^S-D,p^ maxP, q maxQ; 

• loop A: 

— if p = —00 and q = — oo then break loop A; 

— else if p > q then 

* set s ^ supply value of p, P P\ {p}; 

* loop B: 

■ if Sq — then prepend {p, f — s, oo) to L and break loop B; 

■ pop the pair {q' , f) from stack Sq; 

■ iff < f — s thenprepend {p, f — s,q') to C, push the pair {q' , f) 
on stack Sq if f < f — s, and break loop B; 

■ else prepend (p, /', q') to C; 

* repeat loop B; 

* push the pair {p, /) on stack Sp and set f ^ f — s, p ^ max P; 

— else if p < q then 

* set d^ demand value of q, Q ^ Q \ {q}; 

* loop C: 

■ if Sp = then prepend {q, f + d, oo) to L and break loop C; 

■ pop the pair {p', f) from stack Sp; 

■ if f > f + d then prepend {q, f + d,p') to L, push {p', /') on 
stack Sp if f > f + d, and break loop C; 

■ else prepend {q, f',p') to C; 

* repeat loop C; 

* push the pair {q, f) on stack Sq and set f ^ f + d, q <— max Q; 

— end if; 

• repeat loop A; 

• stop. 

Observe that if / is initialized with S — D = F{oo), then at the exit of loop A it 
will contain F(— oo) = 0. However it is possible to initialize / with any other value, 
e.g. 0, in which case its exit value will be smaller exactly by the amount S — D. It is 
therefore not necessary to compute this quantity beforehand. 

To find leftmost mass elements of chains we also need a list Co of a similar 
format that stores "right neighbors of — oo." To build this list, a variant of the above 
procedure is used. While the list £ was built by "prepending" elements, i.e., adding 
them in front of the list, the following algorithm uses both prepending and appending, 
i.e. adding new elements at the end of the list. The stacks Sp, Sq and the variable 
/ are assumed to be in the same state as at the end of loop A, in particular the 
stacks contain exactly those p and q points whose corresponding vertical segments are 
"visible from — oo." 

Algorithm 3. 

• Set lists £o ^ 0, jC' ^ 0, £" ^ 0; 

• repeat until Sq ^ 0: 

— pop the pair {q' , f) from stack Sq and append {—oo,f',q') to £" ; 

• repeat until Sp ^ 0: 

— pop the pair {p', f) from stack Sp and prepend (— oo, f',p') to C ; 

• if C — then 

— set (— oo, g', /') ^ the first element of L" and append (— oo, /, q') to Cq; 

• else 

— set (— oo,p',/') the last element of £'; 
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— if C" = then append (— oo,/,p') to Cq; 

— else 

* set (—00,(7',/') the first element of C" ; 

* append (— 00, /, min{p', g'}) to Cq; 

— end if; 

• end if; 

• set Co concatenation of C , Cq and C" . 

Finally the list £ is scanned and the values F{pi ± 0), F{qj ± 0), which appear 
as second elements of its constituent triples and define locations of the dotted lines 
separating strata, are sorted in decreasing order to give the sequence 

yo > yi > ■ ■ ■ > Vk, 

where K is the number of strata, k-th stratum by definition lies between yt-i and yk, 
a,nd 1 < K < M + N {K = M + N = 8 inthe example of Figures EUl [Sisi). This is 
the only stage in the process of building the data structure that requires a superlinear 
number of operations, namely 0((M + N) log(A/ + N)). 

5.3. Chain decomposition of transport optimization. Observe that the 
initial transport optimization problem can be replaced with a problem of transport- 
ing the Lebesgue measure supported on "red" vertical segments (representing supply) 
to the Lebesgue measure supported on their "blue" counterparts (representing de- 
mand). The cost function c in the new problem is defined for all points of these 
vertical segments, i.e., mass elements, but depends only on their horizontal coordi- 
nates: c{pt,y',qj,y") = c{p^,qj). 

Define the capacity of fc-th stratum as = yk~i ~ yk and the share of supply pt 
(demand Qj) in stratum k as ^[^^ = ruk if F{pi) < yk < yk-i < F{pi + 0) (respectively, 

^i'fc ~ "^'^ if ^(^i) ^ yk-i > yk > F{qi+0)) and otherwise. For the vertical segments 
representing supply and demand graphically, shares are equal to the lengths of their 
pieces contained between the dotted lines (Figure 15.11) ; we will use notation p,^f^ , /xj'] 

for these subsegments as well. Note that J2k f^'fk — (respectively, J2k — '^j)- 

Definition 5.1. For a given histogram with supplies (pi, Si) and demands {qj, dj) 
define a stratified transport plan as the set of nonnegative values {"fi^k;j,i), where 
1 < i < M, 1 < j < N , and 1 < k,£ < K, such that the following conditions are 
satisfied: 

^7i,fe;j,^ = Mj-'j for all j, ^liMj/ < for all i, k. (5.1) 

i,k 

Note that the numbers 

lij=^li,k;j,e (5.2) 

kj 

form an admissible transport plan (i.e., all conditions (|1.3p are satisfied). We will 
call this plan the projection of the stratified plan in question. The cost of a stratified 
transport plan is defined as f, j ^ c(j)i, qj) ji^k-j.e] of course it coincides with the cost 
of its projection. 

Conversely, let 7 = (7^), 1 < i < M, 1 < j < TV be an admissible transport 
plan; we call a stratified transport plan that satisfies (|5.2p a stratification of 7. Any 
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admissible transport plan admits a non-empty set of stratifications. Indeed, it is easy 
to check that e.g. for ^i^k-ji = lijfJ-i^kl^^j^I/^i^j conditions (15.11) - (|5.2I) are satisfied. 

We now prove that any optimal transport plan in the initial problem can be 
"lifted" to a bundle of disjoint transport plans operating in individual strata. There- 
fore to solve the transport optimization problem for histograms with general real val- 
ues of supply and demand, it suffices to split the problem into transportation problems 
inside strata, where they reduce to the unitary case because the mass exchanged in 
each stratum equals its capacity, and solve these problems one by one. 

Lemma 5.2. An optimal transport plan 7 admits a stratification {^i,k-j.e) that 
satisfies Ji,k;j,i — whenever (. ^ k 

Proof: Indeed, let {'^i,k;j,i) be any stratification of 7 and suppose that 7io,fco;io,^o > 
with i?o 7^ fco- Without loss of generality we restrict the argument to the casep^g < qj^. 

Suppose first that £0 > ^o, i-e., that the demand subsegment /^j^''^^ occupies a 

iv) 

lower stratum than the supply subsegment fJ^i^ kg- total supply located between 

these subsegments, i.e., the sum of all fJ'[^\ with k < kg and fi'fl with < pi < qj^, 
is then smaller than the total demand between these subsegments, i.e., the sum of all 
fi'^^'^j with Pig < qj < qjg and all ^^j^^^ with £ < £q. (From inspection of Figure [5TT] 
it should be easy to see that their difference is equal to J2ko<s<io ''^s, although we 
will not need this quantity here.) Since the first condition (15. ip must be fulfilled for 

all £, it follows that some demand share f-f^' located between and /^j^^^g in 

the just defined sense must be satisfied with supplies located outside. But this leads 
to crossing of the corresponding trajectories (c/ Lemma l2.2p . which implies that the 
total cost of the plan {ji^k;j,e) can be at least preserved, or even reduced, by a suitable 
rescheduling of mass elements. 

Now suppose that £0 < k^. This implies the existence of extra supply /i^f^^,/ be- 
tween Mj-^fcjj and iJ-^jl^i^^- If this supply share is matched, it has to feed some demand 
located outside, which again leads to crossing and can be ruled out just as above. 
If this supply share is not matched (which may happen in an unbalanced problem), 

then a nonzero part of the demand share lj!f^ can be rematched to this supply share, 
which is associated with the point pi' located closer to qj^ than pi^ , thus reducing the 
total cost. In all cases we have a contradiction with the original assumption. □ 



Note that strata are defined for piecewise constant cumulative distributions. A 
simple way to apply these results to continuous distributions of supplies and demands 
consists in approximate them by piecewise constant function using, e.g. quantization 
techniques. 

6. Practical considerations. In this section, we present some ways to optimize 
the use of the local matching indicators in Algorithm [T] 

6.1. Exposed points. Before applying Algorithm [1] one can detect possible 
unmatched points using the following result. 

Lemma 6.1 ("isolation rule"). Suppose that g is strictly monotone and that a 
point Pi of the unbalanced chain (j3.2l) is unmatched in an optimal transport plan. 
Then if i > 1 



c{pi,qi-i) > c(pi_i,gi_i) 
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and if i < N 



Proof: Suppose that a is optimal and assume for instance that i > 1 and c{pi, Qi-i) < 
c{pi-i , Qi^i) . Thanks to Lemma pi is not exposed, and consequently a^^{i — 1) < 
i- 1. Thus, c(pi,gi_i) < c(pi_i,gi„i) < c(po.-i(i-i), It is then cheaper to 

exclude Pa-^(i-i) and match pi with Qi-i, which contradicts the optimality of a. □ 



6.2. About the implementation and the complexity. The cost of the al- 
gorithm can be estimated through the number of additions and evaluations of the 
cost function that are required to terminate the algorithm. These operations are only 
carried out in Step [1] when computing the indicators. This section aims at giving 
details about efficient ways to implement this step and about the complexity of the 
resulting procedure. 

6.2.1. Implementation through a table of indicators. In this section, we 
define a table that collects the values of indicators and then describe a way to update 
it, when a negative indicator has been found. The aim of this structure is to avoid 
redundant computations. We present it in the balanced case (see (|3.1|) ). 

Consider a table of — 1 lines, where the fc-th line corresponds to the values of 
the indicators of order k: /^(l), /^(l), . . . , I^N -k- 1), (A^ -k). At the beginning 
of the algorithm, the table is empty and Step [T] consists in filling the line k of the 
table. Let us explain how to modify the table in case a negative indicator has been 
found. 

Following the assumptions of Theorem 13.21 consider the case where all the indi- 
cators that have been computed currently are positive except the last one. Suppose 
that this one is of the form J^^(io). According to Step HI fco pairs of supply and 
demand have to be matched and removed from the current list of points V. Note 
that the indicators that only deal with points in {pi,qi, . . . ,pig^i,qi^^i,pi^} or in 
{qi„+ko,Pio+ko+i,qio+ko+i^ ■ ■ ■ ^PN,qN} are not affected by this withdrawal, except 
that they may be renamed. Consequently, at an order k < fco, max(0, 2{iQ — fc — 1)) -t- 
max(0, 2(A^ — ia — ko — fc)) indicators' values are already known and in the line fc of 
the new table, 2(iV - fc) + 1 - max(0; 2{io - fc - 1)) - max(0; 2{N - - fco - fc)) 
values remain to be computed. In the case the first negative indicator is of the form 
J^^(io), a similar reasoning shows that in the line fc of the new table, 2{N — k) + 1 — 
max(0; 2(io — fc -I- 1) — 1) — max(0; 2(A^ — io — fco — fc) — 1) values remain to compute. 

6.2.2. Bounds for the complexity. In the vein of the previous section, we 
assume up to now that all the numerical values computed during the algorithm are 
saved. In this framework and as in any assignment problem, the number of evalua- 
tions of the cost function cannot exceed . 

The most favorable case consists in finding a negative indicator at each step of the 
loop. In this case, all points are removed through indicators of order 1. This case 
requires 0{N) additions and evaluations of the cost function. 

On the opposite, the worst case corresponds to the case where all the indicators are 
positive. In such a situation, no pairs are removed until the table is full. All possi- 
ble transport costs c{pi,qj) are computed. Consequently, this case requires -'^(-'^+^) 
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evaluations of the cost function. The number of additions is also bound by 0{N'^) as 
stated in the next theorem. 

Theorem 6.2. Denote by C^{N) the number of additions required to compute 
an optimal transport plan between N supplies and N demands with Algorithmic One 
has: 

C+{N) < 3A^2 - 6A^. 

The proof of this result is given in Appendix. 

In practice we observe that this upper bound is quite coarse, especially for small 
values of a. In order to better understand this behavior, we estimated the empirical 
complexity of our algorithm when N is increasing, for different values of a. For a 
fixed value of N, 100 samples of points are chosen randomly in [0, 1], and the mean 
of the number of additions and evaluations of g are computed. The results are shown 
in Figures 16.1116.21 as log-log graphs. Observe that the less concave the cost function 
is, the more accurate the bound 0{N'^) is. Conversely, when a tends towards 0, the 
complexity seems to get closer to a linear complexity. 




Fig. 6.1. Number of in-line additions with respect to the number of pairs, for various cost 
functions. The number a is the slope of the log-log graphs. In other words, C~^{N) ~ OIN"). 

In order to explain this reduced complexity when a decreases, we can notice that 
if the successive orders at which Step 4 of Algorithm [T] is visited are all bounded by 
AT, then (proof in Appendix) 

C+{N) < miK"^ + 2K + 2). 

Now, if we restrict ourselves to cost functions of the type c{x,y) ^ \x — with 
a G (0, 1], we also show (see Appendix) that local indicators of order 1 tend to be 
more easily negative when a decreases. More precisely, we show that for a given 
configuration of four points, the local indicator is either positive for all a, or there 
exists aa, which depends only on the configuration, such that the indicator is negative 
on (0, ao] and positive on [ao, 1]. As a consequence, the probability for an indicator 
of four points to be negative increases when a decreases. We conjecture that this 
last result remain true for all indicators (and we checked empirically that it is). If 
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Fig. 6.2. Number of in-line evaluations of g with respect to the number of pairs, for various 
cost functions. The number a is the slope of the log-log graphs. 



this conjecture turns out to be true, the smaller a is, the more probable it is for any 
indicator to be negative, and the more realistic it is that the bound K is small in 
comparison to N. This would explain the previous empirical results. 

6.3. Possible improvements. The use of Algorithms [TH3] enables to tackle 
transport problems involving real- valued histograms in 0{N^) operations. Neverthe- 
less, we emphasize that this complexity could be reduced since there is certainly room 
for improvement in the above algorithmic strategy. As an example, identical indica- 
tors may appear in different strata and should not be treated independently to save 
computational time. The investigation of the interplay between the strata remains 
for future assessment. 
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7. Appendix. 

Proof of Theorem 16.21 Before proving Theorem l6.2[ let us state some interme- 
diate results. In what follows, we denote by c'^{N) the number of additions required 
to achieve Step [T] of the algorithm for an arbitrary value of k. 

Lemma 7.1. Keeping the previous notations, we have: 

c+(iV)<3(2(iV-fc)-l). (7.1) 



Proof: The proof of (|7.ip in the case fc = 1 is left as exercise for the reader. Suppose 
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that A: > 1. Consider for example /^(«) and recall that: 



k-l 



(7.2) 



i=0 



The first term of this formula does not require any addition and most of the other 
terms have already been computed during the previous steps. Indeed, the first sum 
has been computed to evaluate and the second one has been computed to 

evaluate I^-iii)- It remains to add c{pi-i.k,qi+k-i) to it to compute the last sum 
of (|7.2p . Since at given order k at most 2 (TV ~ k) ~1 indicators have to be computed, 
the result follows. □ 

We now consider the number of operations required between the beginning of the 
algorithm and the first occurrence of Step [H 

Lemma 7.2. The operations required by the algorithm between its beginning and 
the first occurrence of Step^^can be achieved withtj^^{N) := 3fco(2iV— /cq — 2) additions, 
where /cq denote the current value of k when Step [7] occurs. 

Proof: Between the beginning of the algorithm and the first occurrence of Step HI only 
positive indicators have been computed, except for the current value of k — kg. This 
means that Step [T] has been carried out for k — 1, . . . , fco since the beginning. The cor- 
responding number of additions is bounded by X]fe=i '^ti-^)- Thanks to Lemma l7.1|, 
the result follows. □ 

Recall now that after Step S] has been achieved, the parameter k is set to 1. The 
previous arguments consequently apply to evaluate the number of additions between 
two occurrences of Step [H i.e. between two withdrawals. In this way, one finds that 
this number is bounded by £'^, {N'), where N' and /cq are the current values of N and 

k at the last occurrence of Step HI Note that it, (N') is a coarse upper bound because 
we are not considering the first occurrence of this step and a part of the indicators 
has already been computed as explained in Section 16.2.11 

We are now in position to prove Theorem 16.21 

Proof (of Theorem \6.2\) : Let ko,ki, . . . ,ks be the successive orders at which the Step [4] 
of the algorithm is visited. Observe that some of these numbers can be equal. Assume 
also that only one negative indicator was found at each of these orders, which is the 
worst case for complexity. As a consequence, X]i=o ~ ^'^^ number of 
additions required for the whole algorithm is lower than 



C+<^<(7V-2^^-)' 
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where is defined in Lemma |7.2[ Using Lemma |7.2[ we compute 
C+ < ^ 3fc.(2(iV - ^o) - fc« - 2) 

s — 1 i—1 5—1 

= ^ 3fc,(2(iV - ^ kj) -h-2) + 3ks{2{N - ^ kj) - k, ~ 2) 

s — 1 i—1 s — 1 s — 1 

= ^ 3fc.(2(iV - ^i) ~h-2) + 3{N-Y k,){N - ^ % ^ 2) 

s—1 2—1 s—1 s—1 

= 3Ar2 _ 6iV _ 6 ^ ^ fc.A:, - 3 ^ fcf + 3(^ fc,f 

= - 6N. 



□ 



Alternative complexity upper bound. Suppose that the first occurence of 
Step 4 is achieved at level kg, in 3fco(2A^ — /co " 2) additions. At this point, we remove 
2fco points in the total chain. Observe that the number of indicators of order k that 
have changed after this removal of 2ko points is at most 2fc + 1. Let ki be the next 
order at which Step 4 is visited. If the indicators computed during the first pass of 
the algorithm have been kept in memory, this means that the number of additions 
necessary in the second pass is smaller than 3 X;feLi(2A: + l) = 3(/fc? + 2/ci). This yields 
an alternative upper bound of the whole algorithm complexity 

s 

C+ < 3ko{2N -ko-2) + Y ^{kf + 2^0- (7-3) 

i=l 

If the successive orders at which Step 4 of the algorithm is visited are all bounded by 
then C+ < m{K^ + K + 2). 

Sign of indicators for costs \x — y\". Consider four consecutive points pi, qi, 
Pi+i, li+i in a chain. Assume without loss of generality that \qi+i — Pi\ = 1. Let 
a ^ \qi - Pi\, b = \pi+i - qi\ and c = {q^+i - p^+i], so 5 = 1 - a - c. Assume that 
b < min(a, c) and define 

/(a) =6" + l-a"-c". (7.4) 

It can be shown that iffe = l — a— c> ac, then / is positive and increasing on [0, 1] 
(this result can be seen as a refined version of the rule of three). Indeed, the derivative 
of / is 

/'(a) = \og{b)b" - \og{a)a" - log(c)c«. 

If 6 > ac, then log(&) > log(a) + log(c), which implies that 

/'(a) > log(a)(6" - a") + log(c)(&" - c") > 0. 

Since /(O) = 0, the result follows. As a consequence, for all costs of the form |a; — t/|", 
if & > ac, the indicator will be positive. 
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Now, assume that b=l — a— c<ac. In this case, the indicator /f (z) can be 
negative if a is small enough. Indeed, /(O) = and /'(O) < 0, which implies that / is 
negative in the right neighborhood of 0. Now, /(I) = 2 — 2a — 2c > 0, which means 
that the indicator is positive for a close to 1. 

Consequently, there exists ag such that /(ao) = 0. Moreover, we can assume that 
/'(ao) > 0, which means that log(6)6"° — log(a)a"'' — log(c)c"° > 0. Now consider 
a > Q!o. One has successively: 

/'(a) = log(6)6""6"-"° - log(a)a" - log(c)c" 

> (log(a)a"° + log(c)c"'')6"""° - log(a)a" - log(c)c" 
= (log(a)a"° + log(c)c"°)6"-"° - log(a)a" - log(c)c" 

= log(a)a"°(6"-"° - a"-"") + log(c)c"° (6"""° - c"-°'°) 

> 0. 

This implies that if an indicator or order 1 is negative for a given a in [0, 1], it will 
remain negative for smaller powers. 
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